On Potts Model Clustering, Kernel K-means, and Density Estimation
Authors
Abstract
Many clustering methods, such as K-means, kernel K-means, and MNCut clustering, follow the same recipe: (i) choose a measure of similarity between observations; (ii) define a figure of merit assigning a large value to partitions of the data that put similar observations in the same cluster; (iii) optimize this figure of merit over partitions. Potts model clustering, introduced by Blatt, Wiseman, and Domany (1996), represents an interesting variation on this recipe. Blatt et al. define a new figure of merit for partitions that is formally similar to the Hamiltonian of the Potts model for ferromagnetism, extensively studied in statistical physics. For each temperature T, the Hamiltonian defines a distribution assigning a probability to each possible configuration of the physical system or, in the language of clustering, to each partition. Instead of searching for a single partition optimizing the Hamiltonian, they sample a large number of partitions from this distribution for a range of temperatures. They propose a heuristic for choosing an appropriate temperature and, from the sample of partitions associated with this chosen temperature, derive what we call a consensus clustering: two observations are put in the same consensus cluster if they belong to the same cluster in the majority of the random partitions. In a sense, the consensus clustering is an "average" of plausible configurations, and we would expect it to be more stable (over different samples) than the configuration optimizing the Hamiltonian. The goal of this paper is to contribute to the understanding of Potts model clustering and to propose extensions and improvements: (1) We show that the Hamiltonian used in Potts model clustering is closely related to the kernel K-means and MNCut criteria. (2) We propose a modification of the Hamiltonian penalizing unequal cluster sizes and show that it can be interpreted as a weighted version of the kernel K-means criterion. (3) We introduce a new version of the Wolff algorithm to simulate configurations from the distribution defined by the penalized Hamiltonian, leading to penalized Potts model clustering. (4) We note a link between kernel-based clustering methods and non-parametric density estimation and exploit it to automatically determine locally adaptive kernel bandwidths. (5) We propose a simple new rule for selecting a good temperature T.
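To make the recipe concrete: in the formulation of Blatt et al., the figure of merit is, up to constants, the Potts Hamiltonian $H(\sigma) = \sum_{i<j} k_{ij}\,(1 - \delta_{\sigma_i \sigma_j})$, where $k_{ij}$ is the similarity between observations $i$ and $j$, $\sigma_i$ is the cluster label of observation $i$, and $\delta$ is the Kronecker delta, so a cost is paid whenever similar observations are split across clusters. The sketch below illustrates only the consensus step described above, not the Monte Carlo sampling itself; the function name, the sparse-graph route, and the majority threshold of 0.5 are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def consensus_clustering(partitions, threshold=0.5):
    """Derive a consensus clustering from sampled partitions.

    partitions : (S, n) integer array; row s holds the cluster labels
                 assigned to the n observations in sample s.
    Two observations land in the same consensus cluster if they share
    a cluster in more than `threshold` of the sampled partitions.
    """
    partitions = np.asarray(partitions)
    S, n = partitions.shape
    # Co-occurrence frequency: fraction of samples in which i and j share a label.
    co = np.zeros((n, n))
    for labels in partitions:
        co += labels[:, None] == labels[None, :]
    co /= S
    # Majority rule: link pairs that co-occur more than half the time,
    # then read off connected components as the consensus clusters.
    adjacency = csr_matrix(co > threshold)
    _, consensus_labels = connected_components(adjacency, directed=False)
    return consensus_labels

# Toy usage: three sampled partitions of five observations.
samples = [[0, 0, 1, 1, 2],
           [0, 0, 1, 1, 1],
           [1, 1, 0, 0, 2]]
print(consensus_clustering(samples))   # -> [0 0 1 1 2]
```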
Similar resources
The Conditional-Potts Clustering Model
A Bayesian kernel-based clustering method is presented. The associated model arises as an embedding of the Potts density for label membership probabilities into an extended Bayesian model for joint data and label membership probabilities. The method may be seen as a principled extension of the so-called super-paramagnetic clustering. The model depends on three parameters: the temperature, the k...
Asymptotic Behaviors of Nearest Neighbor Kernel Density Estimator in Left-truncated Data
Kernel density estimators are the basic tools for density estimation in non-parametric statistics. The k-nearest neighbor kernel estimators represent a special form of kernel density estimators, in which the bandwidth is varied depending on the location of the sample points. In this paper, we initially introduce the k-nearest neighbor kernel density estimator in the random left-truncatio...
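A minimal sketch of the k-nearest-neighbor flavor of kernel density estimation described in this snippet: the bandwidth attached to each sample point is its distance to its k-th nearest neighbor, so the kernel widens in sparse regions and narrows in dense ones. This assumes plain i.i.d. one-dimensional data and omits the paper's left-truncation correction; the function name is illustrative.

```python
import numpy as np

def knn_kernel_density(x_eval, sample, k=10):
    """k-NN kernel density estimate with a Gaussian kernel.

    The local bandwidth h_i for sample point X_i is the distance from
    X_i to its k-th nearest neighbor.  (Plain i.i.d. 1-D data; no
    truncation adjustment.)
    """
    sample = np.asarray(sample, dtype=float)
    # h[i] = distance from sample[i] to its k-th nearest neighbor.
    dists = np.abs(sample[:, None] - sample[None, :])
    h = np.sort(dists, axis=1)[:, k]       # column 0 is the point itself
    # Average of Gaussian kernels, each with its own local bandwidth.
    u = (np.asarray(x_eval)[:, None] - sample[None, :]) / h[None, :]
    kern = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)
    return (kern / h[None, :]).mean(axis=1)

rng = np.random.default_rng(0)
data = rng.normal(size=200)
grid = np.linspace(-3, 3, 7)
print(knn_kernel_density(grid, data, k=15))
```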
Color Image Segmentation via Improved K-Means Algorithm
Data clustering techniques are often used to segment real-world images. Unsupervised image segmentation algorithms based on clustering suffer from random initialization. There is a need for an efficient and effective image segmentation algorithm that can be used in computer vision, object recognition, image recognition, or compression. To address these problems, the authors ...
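A minimal sketch of the clustering-based segmentation this snippet refers to: pixels are clustered by color with k-means and the label map is read back as segments. It uses scikit-learn's standard k-means++ seeding rather than the improved initialization the paper proposes; the function name and the choice of k are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

def segment_colors(image, k=4, seed=0):
    """Segment an RGB image by k-means clustering of its pixel colors.

    image : (H, W, 3) array.  Returns an (H, W) label map and the k
    cluster-center colors.  Standard k-means++ seeding stands in for
    the paper's improved initialization scheme.
    """
    h, w, c = image.shape
    pixels = image.reshape(-1, c).astype(float)
    km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(pixels)
    return km.labels_.reshape(h, w), km.cluster_centers_

# Toy usage on a synthetic two-color image.
img = np.zeros((8, 8, 3), dtype=np.uint8)
img[:, 4:] = (255, 0, 0)                 # right half red, left half black
labels, centers = segment_colors(img, k=2)
print(labels)
```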
Learning mixtures by simplifying kernel density estimators
Gaussian mixture models are a widespread tool for modeling various complex probability density functions. They can be estimated by various means, often using Expectation-Maximization or Kernel Density Estimation. In addition to these well-known algorithms, new and promising stochastic modeling methods include Dirichlet Process mixtures and k-Maximum Likelihood Estimators. Most of the method...
Working Paper, Alfred P. Sloan School of Management: Using the K-means Clustering Method as a Density Estimation Procedure
A random sample of size N is divided into k clusters that locally minimize the within-cluster sum of squares. This k-means clustering method can be used as a quick procedure for constructing variable-cell histograms that have no empty cells. A histogram estimate is proposed in this paper and is shown to be uniformly consistent in probability.
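A minimal one-dimensional sketch of the idea in this snippet: the k-means cells become histogram bins, with boundaries at the midpoints between consecutive sorted centers, and the density on each cell is its relative frequency divided by its width. Since each center is the mean of its own cluster, every cell contains at least one point, matching the no-empty-cell property. The helper name and the choice of k are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_histogram(sample, k=8, seed=0):
    """Variable-cell histogram density estimate from 1-D k-means.

    Cell boundaries are the midpoints between consecutive sorted
    cluster centers; the density on each cell is
    (cell count / N) / (cell width).  No cell is empty, because each
    center is the mean of a nonempty cluster.
    """
    x = np.asarray(sample, dtype=float).reshape(-1, 1)
    km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(x)
    centers = np.sort(km.cluster_centers_.ravel())
    # Cell edges: data range at the ends, midpoints in between.
    inner = (centers[:-1] + centers[1:]) / 2
    edges = np.concatenate(([x.min()], inner, [x.max()]))
    counts, _ = np.histogram(x, bins=edges)
    density = counts / (x.size * np.diff(edges))
    return edges, density

rng = np.random.default_rng(1)
edges, density = kmeans_histogram(rng.normal(size=500), k=8)
print(np.round(density, 3))
```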
Publication date: 2006